Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multi-disks for PageStorage #1156

Merged
merged 10 commits into from
Nov 4, 2020

Conversation

JaySon-Huang
Copy link
Contributor

@JaySon-Huang JaySon-Huang commented Oct 15, 2020

What problem does this PR solve?

Issue Number: a part of #1128

Problem Summary: Currently, we can only use one path to store a PageStorage. It causes much more IO pressure(from Delta and Region snapshots) for the first disk than other disks when TiFlash is deployed in multi-paths mode. And support multi-paths for PageStorage can distribute the IO pressures between all disks.

Note
This PR only replace PageStorage used by the DT engine.
It doesn't replace the PageStorage which is used for storing Region Info. Will file another PR to do this.

What is changed and how it works?

Design doc:https://docs.google.com/document/d/1281wqqWNVj6LZMiWfcBPPQLOQN7w3koODPVLYWM6E68/edit#

  • Refactor class PathPool
    • PathPool: Only store the main_data_paths and latest_data_paths. Generate StoragePathPool for each storage
    • StoragePathPool: A class to manage paths for the specified storage
    • *-Delegator: Some delegator generated by StoragePathPool. They are used for managing the path for storing stable/delta/raft data
  • Refactor PageStorage and StorageDeltaMerge to adapt to new PathPool
  • The configuration "capacity" only accepts one quota instead of one quota for each path. The capacity quota limits the capacity size report to PD. See class PathCapacityMetrics for more details("capacity" is not recorded in public document and no user use it as we know)
  • Add new configuration "main_data_path", "latest_data_path" to replace "path", support multi-paths for storing delta data. The configuration "path" is deprecated.

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch 4.0

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)
    • Deploy a cluster with version v4.0.7, load TPC-H 1, upgrade TiFlash node, query TPC-H queries, All OK.
      • Then add new configuration "main_data_paths", load another TPC-H 1 dataset, the stable and delta data distribute among all disks
    • Deploy a TiFlash node with configuration "path"
    • Deploy a TiFlash node with configuration "main_data_paths"
    • Deploy a TiFlash node with configuration "main_data_paths" && "latest_data_path"

Side effects

  • Performance regression
    • Consumes more CPU
    • Consumes more MEM
  • Breaking backward compatibility

Release note

  • Add new configuration "main_data_path", "latest_data_path" to replace "path", support multi-paths for storing delta data

@JaySon-Huang JaySon-Huang added the type/enhancement The issue or PR belongs to an enhancement. label Oct 15, 2020
@JaySon-Huang JaySon-Huang self-assigned this Oct 15, 2020
@JaySon-Huang JaySon-Huang force-pushed the refactor_path_pool branch 3 times, most recently from d0455f8 to b0e4919 Compare October 19, 2020 08:45
@JaySon-Huang JaySon-Huang changed the title [WIP] Support multi-paths for PageStorage Support multi-paths for PageStorage Oct 19, 2020
@JaySon-Huang
Copy link
Contributor Author

/run-all-tests

@JaySon-Huang
Copy link
Contributor Author

/run-all-tests

@JaySon-Huang JaySon-Huang changed the title Support multi-paths for PageStorage [WIP] Support multi-paths for PageStorage Oct 23, 2020
// For clean read of column pk, version, tag, instead of loading data from disk, just create placeholder column is OK.
auto & cd = read_columns[i];
if (do_clean_read && isExtraColumn(cd))
try
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes in DMFileReader is adding try ... catch .... Add an error message to indicate which file is being read.

@JaySon-Huang JaySon-Huang force-pushed the refactor_path_pool branch 2 times, most recently from 81bbeba to 414b243 Compare October 26, 2020 11:17
@JaySon-Huang JaySon-Huang changed the title [WIP] Support multi-paths for PageStorage Support multi-paths for PageStorage Oct 27, 2020
Signed-off-by: JaySon-Huang <tshent@qq.com>
@JaySon-Huang JaySon-Huang added type/new-feature Issue or PR for new feature and removed type/enhancement The issue or PR belongs to an enhancement. labels Oct 27, 2020
@JaySon-Huang
Copy link
Contributor Author

/run-all-tests

@JaySon-Huang
Copy link
Contributor Author

/run-all-tests

@JaySon-Huang
Copy link
Contributor Author

addressed, PTAL again

Copy link
Contributor

@flowbehappy flowbehappy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Nov 4, 2020
@JaySon-Huang
Copy link
Contributor Author

/run-all-tests

@JaySon-Huang JaySon-Huang merged commit 09e342e into pingcap:master Nov 4, 2020
@JaySon-Huang JaySon-Huang deleted the refactor_path_pool branch November 4, 2020 05:39
@JaySon-Huang JaySon-Huang changed the title Support multi-paths for PageStorage Support multi-disks for PageStorage Nov 5, 2020
JaySon-Huang added a commit to JaySon-Huang/tiflash that referenced this pull request Nov 15, 2020
Signed-off-by: JaySon-Huang <tshent@qq.com>
JaySon-Huang added a commit that referenced this pull request Nov 15, 2020
* Support multi-disks for PageStorage (#1156)
* Support multi-disks for RegionPersister (#1199)
* Refactor storage configuration and strategies for choosing paths with multi-disks (#1216)

Signed-off-by: JaySon-Huang <tshent@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/LGT1 Indicates that a PR has LGTM 1. type/new-feature Issue or PR for new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants